There is a large discrepancy in our understanding of uncapacitated and capacitated versions of network location problems. This is perhaps best illustrated by the classical k-center problem: there is a simple tight 2-approximation algorithm for the uncapacitated version whereas the first constant factor approximation algorithm for the general version with capacities was only recently obtained by using an intricate rounding algorithm that achieves an approximation guarantee in the hundreds. Our paper aims to bridge this discrepancy. For the capacitated k-center problem, we give a simple algorithm with a clean analysis that allows us to prove an approximation guarantee of 9. It uses the standard LP relaxation and comes close to settling the integrality gap (after necessary preprocessing), which is narrowed down to either 7, 8 or 9. The algorithm proceeds by first reducing to special tree instances, and then solves such instances optimally. Our concept of tree instances is quite versatile, and applies to natural variants of the capacitated k-center problem for which we also obtain improved algorithms. Finally, we give evidence to show that more powerful preprocessing could lead to better algorithms, by giving an approximation algorithm that beats the integrality gap for instances where all non-zero capacities are uniform.

Keywords: approximation algorithms, capacitated network location problems, capacitated k-center problem, LP-rounding algorithms.

1 Introduction

Network location problems lie at the heart of combinatorial optimization. The question of study is 
how to select centers so as to best serve a given set of clients located in a metric space. One can 
imagine several objective functions to measure the quality of service. Perhaps the most natural and 
well-studied ones are "social welfare", where we wish to minimize the average distance from a client
to its assigned center, and "fairness", in which we wish to minimize the maximum distance from
a client to its assigned center. Note that, once we have selected the centers, both these objectives
are minimized by assigning each client to its closest center. An inherent drawback of this strategy,
however, is that it is unable to deal with centers of (different) capacities that limit the number of
clients they can serve, which is a constraint present in most conceivable applications. In fact, these
innocent-looking capacity constraints have troubled researchers for decades and they have a much
bigger impact on our understanding than the choice of objective function.

For uncapacitated network location problems, several beautiful algorithmic techniques, such
as LP-rounding [6], primal-dual framework [15] and local search [17] [5] have been used to obtain
a fine-grained understanding of the approximability of the classic variants: k-center, k-median,
and facility location.¹ Already in the 80's, Gonzalez [10] and Hochbaum & Shmoys [13] developed
tight 2-approximation algorithms for the k-center problem. For facility location, the current best
approximation algorithm is due to Li [19]. He combined an algorithm by Byrka [4] and an algorithm
by Jain, Mahdian, and Saberi [13] to achieve an approximation guarantee of 1.488. This is nearly
tight, as it is hard to approximate the problem within a factor of 1.463 [11]. The gap is slightly larger
for k-median: a recent LP rounding [20] achieves an approximation guarantee of $1 + \sqrt{3} \approx 2.732$,
improving upon a local search algorithm by Arya et al. [1]; and it is NP-hard to do better than
$1 + 2/e \approx 1.736$ [14]. Although the different problems have algorithms with different approximation
guarantees, they share many techniques, and improvements have often come hand in hand. In
particular, most of the above progress relies on standard linear programming (LP) relaxations.

In contrast, the standard LP relaxation fails to give any guarantees for capacitated network
location problems, leading to a much coarser understanding. Apart from special cases, such as
uniform capacities [16], soft capacities (a center can be opened several times) [22] [TB] [15], and
other variants [18^18], the only known constant factor approximation algorithm, until recently, was
for facility location. In a sequence of works, including Korupolu, Plaxton & Rajaraman [17], Pal,
Tardos & Wexler [21], and Chudak & Williamson [7], increasingly enhanced local search algorithms
culminated in an approximation guarantee of 5 [2]. Their methods are elegant but specialized to
facility location and are not LP-based. In fact, finding a relaxation-based algorithm for capacitated 
facility location with a constant approximation guarantee remains a major open problem (see e.g. 
"Problem 5" of the ten open problems from the recent book by Williamson and Shmoys [23]). 
One of the motivations for finding algorithms based on relaxations is that those methods are often 
flexible and the developed techniques transfer to different settings, as has indeed been the case in 
the study of uncapacitated location problems. 

In the quest to obtain a better understanding and more general (relaxation based) techniques for 
capacitated network location problems, it is natural to start with the capacitated k-center problem.

¹Recall that in k-center and k-median, we wish to select k centers so as to minimize the fairness and social welfare,
respectively; facility location is similar to k-median but instead of having a constraint k on the number of centers to
open, each center has an opening cost.



Indeed, even though we have a good understanding of uncapacitated location problems in general,
the uncapacitated k-center problem stands out, with an extremely simple greedy algorithm that
gives a tight analysis of the LP relaxation. Our failure to understand the capacitated k-center
problem is therefore solely due to the lack of techniques for analyzing capacity constraints. An
important recent development in this line of research is due to Cygan, Hajiaghayi and Khuller [9],
who obtain the first constant factor approximation for the capacitated k-center problem. Their
algorithm works by preprocessing the instance to overcome the unbounded integrality gap of the
natural LP relaxation, followed by an intricate rounding procedure. The approximation factor is
not computed explicitly, but is estimated to be roughly in the hundreds. This, however, is still
quite far off from the integrality gap of 7 (after preprocessing) [9] and the inapproximability results
which rule out a factor better than 3 (see e.g. [9] for a simple proof).

In this paper, we develop novel techniques to further close the gap in our understanding of
capacitated location problems. In particular, we present a simple algorithm for the capacitated
k-center problem with a clean analysis that allows us to prove an approximation guarantee of 9.
Our result is based on the standard LP relaxation and it almost settles its integrality gap (after the
preprocessing of Cygan et al. [9]): it is either 7, 8 or 9 (both the integrality gap and approximation
ratio can only take integral values; this is because the worst instances can easily be seen to be ones
defined by the shortest-path metric on an unweighted graph). We next describe this and our other
results in greater detail. Due to the simplicity of our analyses, we hope that some of the ideas could
be applied to other location problems, such as capacitated k-median, for which no constant factor
approximation algorithms are known.

Our main results and proof outline. Our main algorithmic result is the following. 
Theorem 1. There exists a 9-approximation algorithm for the capacitated k-center problem.

Our algorithm takes a guess $r$ on the optimal solution value, and considers an unweighted graph
$G_{\le r}$ on the given set of vertices where two vertices are adjacent if and only if their distance is at
most $r$: this graph represents which assignments are "admissible" with respect to $r$. We solve the
standard LP on this graph, which can be assumed to be connected [9]. This determines if it is
possible to (fractionally) open $k$ vertices while assigning every vertex to a center that is adjacent in
$G_{\le r}$. If this LP is infeasible, we know that the optimum is worse than $r$; otherwise, our algorithm
will find a solution where every vertex is assigned to a center that is within a distance of 9 in $G_{\le r}$,
leading to a 9-approximation algorithm.

The LP solution specifies a set of opening variables that indicate the fraction to which each 
vertex is to be opened. Our algorithm rounds these opening variables by "transferring" openings 
between vertices to make them integral. Since we do not create any new opening, our rounding 
will naturally open at most k centers; however, the challenge is to ensure that there exists a small- 
distance assignment of the vertices to open centers. If, for example, the opening of a vertex v is 
transferred to another vertex that is far away, the clients that were originally assigned to v may be 
unable to find an available center nearby. For another example, if the opening of a high-capacity 
vertex gets transferred to a low-capacity one, the low-capacity vertex may fail to provide sufficient 
capacity to cover the vertices in the neighborhood. Thus, we need to ensure that our rounding
algorithm transfers openings only within a small vicinity, and that the "locally available capacity" of the
graph does not decrease. (Definition 3 formalizes this concept as a distance-r transfer.)



We reduce the rounding problem to the special case of tree instances, and present an algorithm
that rounds such instances optimally. A tree instance is given by a set of opening variables defined
on a rooted tree, where every non-leaf node has an opening variable of 1. Tree instances are
generalizations of caterpillars used by Cygan et al. [9], which can be considered as tree instances
whose non-leaf nodes form a path and have certain degree bounds. Suppose we have a tree instance
where the capacities are uniform and there are exactly two leaves $u$ and $v$ each of which is opened by
1/2, whereas every other vertex is opened by 1. If $u$ and $v$ are distant, this may appear problematic
at a glance as we cannot transfer the opening of one to the other. However, there exists a (unique)
path $u, w_1, \ldots, w_m, v$ in the tree, and we can transfer the opening of 1/2 in a "chain" along this
path: from $u$ to $w_1$, from $w_1$ to $w_2$, ..., from $w_m$ to $v$. This idea can in fact be carried through to
give an algorithm for capacitated k-center when all capacities are equal.

Unfortunately, this chain of transfers causes a problem when the capacities are given arbitrarily:
suppose in the previous example that $u$ and $v$ have very high capacities compared to the others.
Then we will not be able to transfer the opening of $u$ to $w_1$, since the open centers around $u$ may
not be able to provide sufficient capacity to cover the vertices that were originally assigned to $u$.
However, from another angle, $w_1$ (or any other non-leaf vertex) is "wasting" the budget, since it
opens a center while contributing relatively small capacity to the graph. This provides us some
"slack" in the budget that we can utilize: in this particular example, by transferring an opening of
1/2 from $w_1$ to $u$, and the other 1/2 from $w_1$ to $v$ in a chain, we can successfully round the given
instance thanks to the decision of closing $w_1$, which originally had its opening variable equal to
one. This strategy of closing a fully open center is quite powerful, yet we need to ensure that its
capacity can be accommodated by nearby centers if we want to close it. Thus, the viability of such
a strategy tends to depend on several factors, including how its capacity compares to vertices in
the neighborhood, which of these vertices are to be opened, and so on - all decisions which could
depend on more and more distant vertices.

In contrast, our algorithm departs from previous works by using a simple local strategy that 
does not depend on distant vertices and applies to every non-leaf node. The reason our strategy 
works locally is that the decision of closing fully open centers is determined using solutions to 
subinstances, which are solved recursively. This key idea significantly eases the analysis and leads 
to our optimal algorithm for tree instances. The simplicity of our analysis also helps us more
carefully analyze the approximation ratio and extend our techniques to related problems. Section 4
formally presents our algorithm to round a tree instance; the appendix presents the extensions to
two related problems: the capacitated k-supplier problem and the budgeted opening problem with
uniform capacity.

Section 3 presents our reduction to tree instances. We construct a tree instance on a subset of
vertices that are chosen as "candidates" to be opened. Non-leaf nodes will be carefully chosen, in
order to yield a 9-approximation algorithm. Two adjacent vertices in the constructed tree instance
will not necessarily be adjacent in the original graph, but will be in close proximity; hence, if the 
tree instance can be rounded using short transfers of openings, the original instance can also be 
rounded using only slightly longer transfers. 

More results and future directions. In Section 6 we explore future directions towards a
better understanding of the problem. Recall that our algorithm proceeds in three steps: firstly, we
preprocess the given instance using the results of Cygan et al. [9]; secondly, we reduce the problem
to a tree instance; lastly, we solve this tree instance. Given that our tree rounding algorithm is
best-possible, it is natural to seek to improve the first two steps. The preprocessing step of Cygan
et al. allows us to bring down the integrality gap from unbounded to 9; however, the integrality
gap after the basic preprocessing is known to be at least 7 [9], which is larger than the best
known inapproximability result that rules out a better factor than 3. The instance showing the
integrality gap of 7 (and also that of the inapproximability result) has a special structure that every
capacity is either 0 or $L$ for some constant $L$. In order to understand the potential of stronger
preprocessing methods, we investigate this $\{0, L\}$-case and show that additional preprocessing
and a sophisticated rounding gives a 6-approximation algorithm. The interesting fact is that we
obtain an approximation ratio which surpasses the integrality gap lower bound of 7 after basic
preprocessing. This raises the natural open question: could there be preprocessing steps which
bring the approximation ratio down to 3? We could also ask: do lift-and-project methods (applied
to a potentially different formulation) automatically capture these preprocessing steps? We believe
that understanding these questions would also shed light on approximating capacitated versions of
other problems such as facility location and k-median.

2 Preliminaries 

Given an integer $k$ and a metric distance/cost $c : V \times V \to \mathbb{R}_+$ on $V$ with a capacity function
$L : V \to \mathbb{Z}_{\ge 0}$, the capacitated k-center problem is to choose $k$ vertices to open, along with an
assignment of every vertex to an open center which minimizes the longest distance between a
vertex and the center it is assigned to while honoring the capacity constraints: i.e., no open center
$v$ is assigned more vertices than its capacity $L(v)$.

For an undirected graph $G = (V, E)$, $d_G(u, v)$ denotes the distance between $u, v \in V$; $N_G^+(u)$
denotes the set of vertices in the neighborhood of $u$, including $u$ itself: $N_G^+(u) := \{v \mid (u, v) \in E\} \cup \{u\}$.
For $U \subseteq V$, $d_G(v, U)$ denotes the distance from $v$ to $U$: $d_G(v, U) := \min_{u \in U} d_G(v, u)$.
$N_G^+(U)$ is a shorthand for $\bigcup_{u \in U} N_G^+(u)$. When the graph of interest $G$ is clear from the context, we
will use $d$ and $N^+$ instead of $d_G$ and $N_G^+$, respectively. Let OPT denote the optimal solution value.

Reduction to an unweighted problem using the standard LP relaxation. Our algorithm
begins with determining a lower bound $r^*$ on the optimal solution value: it makes a guess $r$ at
OPT, and tries to decide if $r < \mathrm{OPT}$. We simplify this problem by considering an unweighted graph
that represents which assignments are "admissible". Let $G_{\le r} = (V, E_{\le r})$ be the unweighted graph
on $V$ (with loops on every vertex) where two vertices are adjacent if and only if their distance is
at most $r$: $E_{\le r} := \{(u, v) \mid c(u, v) \le r\}$. Note that a feasible solution of value $r$ assigns every
vertex to a center that is adjacent in $G_{\le r}$, and conversely, if a solution assigns every vertex to a
center that is adjacent in $G_{\le r}$, its value is no greater than $r$. For an unweighted graph $G = (V, E)$,
the standard LP relaxation $LP_k(G)$ is the following feasibility LP that fractionally verifies whether
there exists a solution that assigns every vertex to an open center that is adjacent in $G$:
$x_{uv}$ is called an assignment variable; $y_u$ is called the opening variable of $u$.
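For reference, one formulation of $LP_k(G)$ that is consistent with how its constraints are used in the remainder of this section (every vertex is fully assigned, assignments are bounded by openings and by capacities, and the openings sum to $k$) can be sketched as follows. The exact form and the indexing convention, in which $x_{uv}$ is the fraction of vertex $v$ assigned to center $u$, are assumptions inferred from those later uses.

```latex
\begin{align*}
\sum_{u \in V} y_u &= k, \\
x_{uv} &\le y_u && \forall u, v \in V, \\
\sum_{v : (u,v) \in E} x_{uv} &\le L(u)\, y_u && \forall u \in V, \\
\sum_{u : (u,v) \in E} x_{uv} &= 1 && \forall v \in V, \\
0 \le x_{uv},\; y_u &\le 1 && \forall u, v \in V.
\end{align*}
```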

However, the integrality gap of this LP, defined as the maximum ratio $\mathrm{OPT}/r$ where $LP_k(G_{\le r})$
is feasible, is unbounded; hence this LP cannot in general estimate OPT very well. We use the
approach of Cygan et al. [9] to address this issue: consider the connected components of $G_{\le r}$; if
$r \ge \mathrm{OPT}$, a vertex can be assigned only to the vertices in the same connected component. For
each connected component $G_i$ of $G_{\le r}$, the algorithm decides the minimum value of $k_i$ for which
$LP_{k_i}(G_i)$ is feasible; if $\sum_i k_i > k$, this certifies that there exists no solution of value $r$ or better
($r < \mathrm{OPT}$). Now let $r^*$ be the smallest $r$ for which the algorithm fails to certify that $r < \mathrm{OPT}$;
since the algorithm will not be able to provide a certificate for $r = \mathrm{OPT}$, we have $r^* \le \mathrm{OPT}$.
The algorithm then separately solves the subproblems given by the connected components of $G_{\le r^*}$:
given a connected graph $G$ for which $LP_k(G)$ is feasible, our algorithm finds a set of $k$ vertices to
open, with an assignment of every vertex to an open center that is within the distance of nine.
Note that $d_{G_{\le r^*}}(u, v) \le 9$ implies $c(u, v) \le 9 r^* \le 9 \cdot \mathrm{OPT}$ from the triangle inequality.
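The search for $r^*$ described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `min_fractional_centers` is a hypothetical oracle (not defined in the text) that would solve the LP on one connected component and return the smallest $k_i$ for which it is feasible, and the candidate radii are taken to be 0 and the pairwise distances.

```python
from itertools import combinations


def threshold_graph(points, cost, r):
    """Adjacency sets of G_{<=r}: u ~ v iff cost(u, v) <= r (every vertex has a loop)."""
    adj = {u: {u} for u in points}
    for u, v in combinations(points, 2):
        if cost(u, v) <= r:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def connected_components(adj):
    """Connected components of an adjacency-set graph, found by depth-first search."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            u = stack.pop()
            component.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(sorted(component))
    return components


def smallest_uncertified_radius(points, cost, k, min_fractional_centers):
    """Return the smallest candidate r for which sum_i k_i <= k, i.e. for which
    the component-wise LP test fails to certify r < OPT (None if there is none).

    min_fractional_centers(component, adj) is a HYPOTHETICAL oracle returning the
    minimum k_i with LP_{k_i}(G_i) feasible (float('inf') if no k_i works).
    """
    candidates = sorted({0} | {cost(u, v) for u, v in combinations(points, 2)})
    for r in candidates:
        adj = threshold_graph(points, cost, r)
        total = sum(min_fractional_centers(c, adj) for c in connected_components(adj))
        if total <= k:
            return r
    return None
```

Scanning the candidates in increasing order is the simplest choice; since only the pairwise distances matter as thresholds, the loop runs at most a quadratic number of times.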

Lemma 2 (Cygan et al. [9]). Suppose there exists an algorithm that, given a connected graph
$G$, capacity $L$, and $k$ for which $LP_k(G)$ is feasible, computes a set of $k$ vertices to open and an
assignment of every vertex $u$ to an open center $v$ such that $d_G(u, v) \le \rho$ and the capacity constraints
are satisfied. Then we can obtain a $\rho$-approximation algorithm for the capacitated k-center problem.



Distance-r transfers. The above discussion reduces the task of designing an approximation
algorithm for the capacitated k-center problem to that of using a solution $(x, y)$ to $LP_k(G)$ in
order to select $k$ centers so that each vertex in the connected graph $G$ is assigned to a center
in a nearby neighborhood. Simple algebraic manipulations show that the LP solution satisfies
$|U| = \sum_{u \in U} \sum_{w : (w,u) \in E} x_{wu} \le \sum_{w \in N^+(U)} L(w) \cdot y_w$ for all $U \subseteq V$. Note that, if the opening variables $y$ are
integral, this exactly corresponds to Hall's condition [12] and hence we can assign every vertex to
an adjacent center. However, the LP solution may open each center only by a small fractional
amount; in order to obtain an integral solution, it is therefore natural to try to aggregate fractional
openings of nearby vertices. As different centers have varying capacities, one difficulty of this
approach is that the rounding also needs to ensure that the aggregation does not decrease the
available capacity. Consider a center $u$ of capacity $L(u)$ that is open with fraction $y_u$: we can view
it as a center with the fractional capacity of $L(u) \cdot y_u$, because in a sense this is the maximum
number (as a fraction) of vertices this center serves according to the LP. Our rounding procedure
will open $k$ centers, while ensuring that we can transfer the fractional capacity of each $u$ to one or
more of the open centers that are close by (and the performance guarantee is determined by how
close these centers are). The following definition formalizes the notion of a distance-r transfer:

Definition 3. Given a graph $G = (V, E)$ with a capacity function $L : V \to \mathbb{Z}_{\ge 0}$ and $y \in \mathbb{R}_+^V$, a
vector $y' \in \mathbb{R}_+^V$ is a distance-r transfer of $(G, L, y)$ if

(3a) $\sum_{v \in V} y'_v = \sum_{v \in V} y_v$ and

(3b) $\sum_{v : d(v, U) \le r} L(v)\, y'_v \ge \sum_{u \in U} L(u)\, y_u$ for all $U \subseteq V$.

If $y'$ is the characteristic vector of $S \subseteq V$, we say $S$ is a distance-r transfer of $(G, L, y)$.

The given conditions say that a transfer should not change the total number of open centers, 
while ensuring that the total fractional capacity in each small neighborhood does not decrease as a 
result of this transfer. We also remark that multiple transfers can be composed: if $y'$ is a distance-$r$
transfer of $(G, L, y)$ and $y''$ is a distance-$r'$ transfer of $(G, L, y')$ then $y''$ is a distance-$(r + r')$ transfer
of $(G, L, y)$.
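To make the two conditions of the definition concrete, the following brute-force checker tests them directly on a small graph. It is a sketch for illustration only: it enumerates every subset $U$, so it is exponential in $|V|$, and the function names and the four-vertex path example (the chain $u, w_1, w_2, v$ from the introduction) are choices made here rather than anything prescribed by the text.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations


def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph given as adjacency lists."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def is_distance_r_transfer(adj, L, y, y_new, r):
    """Check both conditions of the definition by enumerating every nonempty U."""
    vertices = list(adj)
    # First condition: the total opening is unchanged.
    if sum(y_new.values()) != sum(y.values()):
        return False
    dist = {u: bfs_distances(adj, u) for u in vertices}
    # Second condition: capacity within distance r of U covers U's fractional capacity.
    for size in range(1, len(vertices) + 1):
        for U in combinations(vertices, size):
            nearby = [v for v in vertices
                      if any(dist[u].get(v, r + 1) <= r for u in U)]
            if sum(L[v] * y_new[v] for v in nearby) < sum(L[u] * y[u] for u in U):
                return False
    return True


# Path u - w1 - w2 - v with the two leaves half open.
path = {"u": ["w1"], "w1": ["u", "w2"], "w2": ["w1", "v"], "v": ["w2"]}
y = {"u": Fraction(1, 2), "w1": 1, "w2": 1, "v": Fraction(1, 2)}
uniform = {x: 1 for x in path}
skewed = {"u": 10, "w1": 1, "w2": 1, "v": 10}
chain = {"u": 0, "w1": 1, "w2": 1, "v": 1}      # push 1/2 along the path to v
close_w1 = {"u": 1, "w1": 0, "w2": 1, "v": 1}   # close w1, open both leaves
```

With uniform capacities the chain is a distance-1 transfer; with the skewed capacities it is not, whereas closing `w1` is.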

Lemma 4. For a graph $G = (V, E)$ with a capacity function $L : V \to \mathbb{Z}_{\ge 0}$, let $(x, y)$ be a feasible
solution to $LP_k(G)$. If $S \subseteq V$ is a distance-r transfer of $(G, L, y)$, then every vertex $v \in V$ can be
assigned to a center $s \in S$ such that $d_G(v, s) \le r + 1$, while ensuring no center is assigned more
vertices than its capacity. Moreover, $|S| = k$, and this assignment can be found in polynomial time.

Proof. Consider the natural bipartite matching problem between $V$ and the multiset of open centers
that are duplicated to their capacities: i.e., each center $s \in S$ appears in the multiset with
multiplicity $L(s)$. Every vertex $v$ in $V$ is connected to every copy of each center $s \in S$ such that
$d(v, s) \le r + 1$. Observe that a matching of cardinality $|V|$ naturally defines an assignment that
satisfies the desired properties. We shall now show that there exists such a matching by verifying
Hall's condition, i.e., that for all $U \subseteq V$, $|U| \le \sum_{s \in S : d_G(s, U) \le r+1} L(s)$.

As was observed earlier, we have $|U| \le \sum_{w : d_G(w, U) \le 1} L(w) \cdot y_w$; from Condition (3b),
$|U| \le \sum_{w : d_G(w, U) \le 1} L(w) \cdot y_w \le \sum_{s \in S : d_G(s, U) \le r+1} L(s)$. This matching can be found in polynomial time,
and $|S| = k$ follows from Condition (3a). □
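The matching argument in the proof translates into a short augmenting-path routine. The sketch below is an assumption-laden illustration rather than the authors' implementation: it uses the textbook augmenting-path method for bipartite matching, and it caps the number of copies of each center at $|V|$, since no center can ever serve more vertices than that.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph given as adjacency lists."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def assign_to_centers(adj, L, S, r):
    """Assign every vertex to a center of S within distance r + 1, respecting
    capacities. Returns {vertex: center}, or None if no such assignment exists."""
    n = len(adj)
    dist = {s: bfs_distances(adj, s) for s in S}
    # One slot per unit of capacity; more than n copies of a center are never needed.
    slots = [s for s in S for _ in range(min(L[s], n))]
    owner = [None] * len(slots)

    def try_place(v, visited):
        # Standard augmenting-path step: take a free slot or evict and re-place.
        for i, s in enumerate(slots):
            if i in visited or dist[s].get(v, r + 2) > r + 1:
                continue
            visited.add(i)
            if owner[i] is None or try_place(owner[i], visited):
                owner[i] = v
                return True
        return False

    for v in adj:
        if not try_place(v, set()):
            return None
    return {owner[i]: slots[i] for i in range(len(slots)) if owner[i] is not None}
```

A return value of `None` means that Hall's condition fails for the given set `S` and radius, which by the lemma cannot happen when `S` is a distance-r transfer of a feasible LP solution.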



Tree instances. As was discussed earlier, we solve the general problem via reduction to tree 
instances. 

Definition 5. A tree instance is defined as a tuple $(T, L, y)$, where $T = (V, E)$ is a rooted tree with
the capacity function $L : V \to \mathbb{Z}_{\ge 0}$, and opening variables $y \in (0, 1]^V$ satisfy that $\sum_{v \in V} y_v$ is an
integer and $y_v = 1$ for every non-leaf node $v \in V$.
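The requirements of a tree instance are easy to check mechanically. The following validator is a small sketch; the representation (a `children` dictionary plus a root, with openings stored as `Fraction` values) is an assumption made for the example, not a data structure from the paper.

```python
from fractions import Fraction


def is_tree_instance(children, root, y):
    """Check the definition: rooted tree, y in (0, 1], integral total opening,
    and y_v = 1 for every non-leaf node. `children` maps a node to its children."""
    seen, stack = set(), [root]
    while stack:
        v = stack.pop()
        if v in seen:            # a node reached twice means this is not a tree
            return False
        seen.add(v)
        stack.extend(children.get(v, []))
    if seen != set(y):           # every node needs exactly one opening variable
        return False
    if any(not (0 < y[v] <= 1) for v in y):
        return False
    if any(children.get(v) and y[v] != 1 for v in y):
        return False
    return sum(y.values()).denominator == 1


# A root with six children, each opened with fraction 2/3 (total opening 5).
star = {"root": ["l1", "l2", "l3", "l4", "l5", "l6"]}
star_y = {"root": Fraction(1), **{leaf: Fraction(2, 3) for leaf in star["root"]}}
```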

3 Reducing General Instances to Trees 

In this section, we present the reduction from the capacitated k-center problem to tree instances.

Lemma 6. Suppose there exists a polynomial-time algorithm that finds an integral distance-r transfer
of a tree instance. Then there exists a $(3r + 3)$-approximation algorithm for the capacitated
k-center problem.




Lemma 6 directly follows from Lemmas 2, 4 and 7.

Lemma 7. Suppose there exists a polynomial-time algorithm that finds an integral distance-r transfer
of a tree instance. Then there exists an algorithm that, given a connected graph $G = (V, E)$,
capacity $L : V \to \mathbb{Z}_{\ge 0}$, and $k \in \mathbb{N}$ for which $LP_k(G)$ has a feasible solution $(x, y)$, finds an integral
distance-$(3r + 2)$ transfer of $(G, L, y)$.

Our reduction, conceptually, constructs a tree instance by defining a tree on a subset of the 
vertices that have nonzero opening variables in the LP solution. Adjacent vertices in this tree 
instance may not necessarily be adjacent in G, but will be in close proximity; this establishes that 
a distance-r transfer of the tree instance can be interpreted as a transfer of short distance in G as 
well. The opening variables of this tree instance would ideally be set equal to the corresponding 
LP opening variables. However, recall that one of the crucial characteristics of tree instances is 
that every internal node has the opening variable of one. Yet, individual opening variables of the 
LP solution may have values less than one in general; we address this issue by using the clustering
due to Khuller and Sussman [16].

Lemma 8 (Khuller and Sussman [16]). Given a connected graph $G = (V, E)$, $V$ can be partitioned
into $\{C_v\}_{v \in \Gamma}$ for some set of cluster midpoints $\Gamma \subseteq V$, such that

• there exists a tree $U = (\Gamma, F)$ rooted at $r \in \Gamma$ such that for every $(u, v) \in F$, $d_G(u, v) = 3$;

• for all $v \in \Gamma$, $N_G^+(v) \subseteq C_v$; and

• for all $u \in C_v$, $d_G(u, v) \le 2$.
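One way to obtain a partition with these three properties is a simple greedy procedure: repeatedly pick a vertex whose distance to the current set of midpoints is exactly 3, make it a child of a midpoint at distance 3, and finally assign every vertex to its nearest midpoint. The sketch below illustrates this idea under the assumption that the graph is connected; it is not claimed to be the construction used in the cited work.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph given as adjacency lists."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def greedy_clustering(adj, root):
    """Return (parent, cluster_of): parent maps each midpoint to its parent in the
    tree U (None for the root); cluster_of maps every vertex to its midpoint."""
    dist_from = {root: bfs_distances(adj, root)}
    nearest = dict(dist_from[root])     # distance to the closest midpoint so far
    parent = {root: None}
    while True:
        # A vertex at distance exactly 3 from the midpoint set, if one exists.
        v = next((w for w in adj if nearest[w] == 3), None)
        if v is None:
            break                       # now every vertex is within distance 2
        parent[v] = next(m for m in parent if dist_from[m][v] == 3)
        dist_from[v] = bfs_distances(adj, v)
        for w in adj:
            nearest[w] = min(nearest[w], dist_from[v][w])
    # Midpoints are pairwise at distance >= 3, so a vertex adjacent to a midpoint
    # has that midpoint as its unique nearest one.
    cluster_of = {w: min(parent, key=lambda m: dist_from[m][w]) for w in adj}
    return parent, cluster_of
```

The loop terminates with every vertex within distance 2 of some midpoint because, on a shortest path from any farther vertex to the midpoint set, the vertex three hops from the set would still qualify as a new midpoint.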

Observe that, for every cluster $C_v$, the total opening in the neighborhood of $v$ is at least one:
$\sum_{u \in N^+(v)} y_u \ge \sum_{u \in N^+(v)} x_{uv} = 1$ from the LP constraints. We will aggregate these openings to
create at least one vertex with the opening variable of one in each cluster; then each cluster will
contribute one "fully open vertex" to the tree instance, which will become the non-leaf nodes of
the tree. Two non-leaf nodes in the tree instance are made adjacent if and only if their clusters are
adjacent in $U$. In order to ensure that the aggregation retains the fractional capacity in the graph
(in other words, to satisfy Condition (3b) of Definition 3), we will transfer the openings in $N_G^+(v)$
to a vertex with the highest capacity in $N_G^+(v)$. Let $m_v := \arg\max_{u \in N_G^+(v)} L(u)$ denote this vertex.

If $m_u$ and $m_v$ are adjacent in this tree instance, how far can they be in $G$? Recall that $m_u$ and $m_v$
are adjacent if and only if $(u, v) \in F$; hence, $d_G(m_u, m_v) \le d_G(m_u, u) + d_G(u, v) + d_G(v, m_v) \le 5$.
However, here comes a subtlety: if $m_v$ and $m_w$ are also adjacent in the tree, we would expect
$d_G(m_u, m_w) \le d_G(m_u, m_v) + d_G(m_v, m_w) \le 10$, whereas a tighter bound shows that $d_G(m_u, m_w)$
in fact never exceeds 8: $d_G(m_u, m_w) \le d_G(m_u, u) + d_G(u, v) + d_G(v, w) + d_G(w, m_w) \le 1 + 3 + 3 + 1$.
Therefore, a simple abstraction that a tree edge corresponds to a length-5 path in $G$ would lead to
a slight slack in the analysis. In order to avoid this issue, we will create an auxiliary vertex $a_v$ that
is "almost at the same position" as the cluster midpoint $v$ for each cluster, and aggregate openings
to this auxiliary vertex $a_v$ instead of $m_v$ as we did earlier. We will treat $a_v$ as the delegate for $m_v$,
in the sense that $a_v$ (in lieu of $m_v$) will be part of our tree instance, and if we decide to open $a_v$
from the tree instance, we will open $m_v$ instead.

Proof of Lemma 7. We first augment the graph by introducing the auxiliary vertices (see also Figure 1):
for each $C_v$, we add a new vertex $a_v$ to the graph, along with the edges from $a_v$ to every
vertex in $N_G^+(v)$. Let $\hat G = (\hat V, \hat E)$ be this augmented graph. Observe that $a_v$ is located "almost at
the same position" as $v$ in the following sense: for every $u \in V$, $d_{\hat G}(u, a_v) = d_G(u, v)$ unless $u = v$;
$d_{\hat G}(v, a_v) = 1$. Note that $d_{\hat G}(a_w, a_z) = d_G(w, z)$. $L$ and $y$ are accordingly augmented to $\hat L$ and $\hat y$ by setting the
capacity and the opening variable of the new auxiliary vertex respectively as $\hat L(a_v) := L(m_v)$ and
$\hat y_{a_v} := 0$.

Figure 1: Graph $\hat G$ obtained by augmenting $G$ with auxiliary vertices; black nodes correspond to
cluster midpoints, dashed circles represent their neighborhoods.
Now our reduction works in three phases: in the first phase, we aggregate the opening of 1
from $N_G^+(v)$ to $a_v$; this phase yields a distance-1 transfer $y^{\text{first}}$ of the augmented instance $(\hat G, \hat L, \hat y)$. In the second phase,
we construct a tree instance by defining a tree on a subset of $\hat V$, and invoke the polynomial-time
algorithm to find an integral distance-r transfer of this tree instance. We will see that this transfer
can be interpreted as a distance-3r transfer $y^{\text{second}}$ of $(\hat G, \hat L, y^{\text{first}})$. In the last phase, we transfer the
opening of each auxiliary vertex $a_v$ to the vertex it delegates, $m_v$. This constitutes a distance-1
transfer $y^{\text{third}}$ of $(\hat G, \hat L, y^{\text{second}})$.

The opening aggregation in the first phase works as follows: for each cluster $C_v$, we increase
$y_{a_v}$ while simultaneously decreasing $y_u$ for some $u \in N_G^+(v)$ with $y_u > 0$. If $y_{a_v}$ reaches one, we
stop; if $y_u$ reaches zero, we find another $u \in N_G^+(v)$. The initial choice of $u$ is always taken as
$m_v$ so that this procedure ensures that $y_{m_v}$ becomes zero. The procedure outputs a distance-1
transfer $y^{\text{first}}$, since whenever an opening variable decreases during the construction, we increase
the opening variable of an adjacent vertex with higher or equal capacity.
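The first-phase aggregation for a single cluster can be sketched as follows. The function name and the return convention (the highest-capacity vertex $m_v$, the updated openings, and the capacity given to the auxiliary vertex) are choices made for this illustration; exact rational arithmetic is used so that the stopping tests are reliable.

```python
from fractions import Fraction


def aggregate_to_auxiliary(neighborhood, L, y):
    """Move exactly one unit of opening out of N^+(v) into the auxiliary vertex a_v.

    Returns (m_v, y_first, capacity_of_a_v); the opening of a_v itself is 1.
    Openings are expected to be Fraction values.
    """
    m_v = max(neighborhood, key=lambda u: L[u])        # highest capacity in N^+(v)
    order = [m_v] + [u for u in neighborhood if u != m_v]
    y_first = dict(y)
    missing = Fraction(1)                              # opening a_v still needs
    for u in order:
        moved = min(y_first[u], missing)
        y_first[u] -= moved
        missing -= moved
        if missing == 0:
            break
    if missing != 0:
        raise ValueError("the openings in N^+(v) sum to less than one")
    return m_v, y_first, L[m_v]
```

Because $m_v$ is drained first and its opening is at most one, its opening is always zero afterwards, which is the property the second phase relies on.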

In the second phase, we define a tree $T$ on the set of vertices with nonzero opening variables.
Note that this in particular implies that $m_v \notin T$ for each cluster $C_v$. $T$ is constructed from
$U = (\Gamma, F)$ as follows: we replace each $v \in \Gamma$ by $a_v$, to obtain a tree on the auxiliary vertices, and
for each vertex $u \in C_v$ such that $y^{\text{first}}_u > 0$, we attach $u$ as a (leaf) child of $a_v$. Note that every
non-leaf node is an auxiliary vertex and therefore has the opening variable of one. The total opening
is equal to the total opening of $y$, and therefore $(T, \hat L, y^{\text{first}})$ is a valid tree instance; we invoke
the polynomial-time algorithm to find an integral distance-r transfer of this instance. For any two
nodes $i$ and $j$ that are adjacent in this tree instance, either $i = a_u$ and $j = a_v$ for some $(u, v) \in F$,
or $i = a_v$ and $j \in C_v$. In the former case, $d_{\hat G}(i, j) = 3$; in the latter case, $d_{\hat G}(i, j) \le 2$. Thus,
the integral distance-r transfer of the tree instance can be interpreted as an integral distance-3r
transfer $y^{\text{second}}$ of $(\hat G, \hat L, y^{\text{first}})$.

Note that $y^{\text{second}}_{m_v} = 0$ for every cluster $C_v$, since $m_v$ does not participate in the tree instance; on
the other hand, $a_v$ may have been opened by the tree algorithm. In the last phase, we transfer the
opening of $a_v$ to $m_v$, the vertex delegated by $a_v$. This yields an integral distance-1 transfer $y^{\text{third}}$
of $(\hat G, \hat L, y^{\text{second}})$.

Note that $y^{\text{third}}_{a_v} = 0$ for every cluster $C_v$; by projecting $y^{\text{third}}$ back to $V$, we obtain an integral
distance-$(3r + 2)$ transfer of $(G, L, y)$. □

4 Algorithm for Tree Instances 

In this section we prove the following. 

Lemma 9. There is a polynomial time algorithm that finds an integral distance-2 transfer of a 
given tree instance (T,L,y). 

We remark that it is easy to see that some tree instances do not admit an integral distance-1
transfer and the above lemma is therefore the best possible. One example is the following: the
instance consists of a root with six children, where each child is opened with a fraction 2/3, and
all vertices have the same capacity; it is easy to see that any integral solution needs to transfer
fractional capacity from one leaf to another (i.e., of distance 2). We now present the algorithm
along with the arguments of its correctness.
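The six-children example is small enough to verify exhaustively. The sketch below assumes unit capacities and a root opening of 1 (as required of a non-leaf node), tries every set of five vertices, and tests the capacity condition of the definition of a distance-r transfer for every subset $U$; the names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

leaves = ["l1", "l2", "l3", "l4", "l5", "l6"]
vertices = ["root"] + leaves
y = {"root": Fraction(1), **{leaf: Fraction(2, 3) for leaf in leaves}}


def dist(a, b):
    """Distances in the star: root-leaf is 1, leaf-leaf is 2."""
    if a == b:
        return 0
    return 1 if "root" in (a, b) else 2


def is_transfer(S, r):
    """Capacity condition for unit capacities: for every U, the number of opened
    vertices within distance r of U is at least the total opening inside U."""
    for size in range(1, len(vertices) + 1):
        for U in combinations(vertices, size):
            reachable = sum(1 for s in S if any(dist(s, u) <= r for u in U))
            if reachable < sum(y[u] for u in U):
                return False
    return True


budget = int(sum(y.values()))                    # total opening: 1 + 6 * 2/3 = 5
candidates = list(combinations(vertices, budget))
has_distance_1 = any(is_transfer(S, 1) for S in candidates)
has_distance_2 = any(is_transfer(S, 2) for S in candidates)
```

Under these assumptions `has_distance_1` is false while `has_distance_2` is true: whichever two vertices are left closed, either a closed leaf is next to a closed root, or two closed leaves share only the root within distance 1.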

The algorithm builds up the solution by recursively solving smaller tree instances. The base case
is simple: if $|T| \le 1$ then simply open the vertex in $V(T)$ if any. By the integrality of $\sum_{v \in V(T)} y_v$
this is clearly a distance-2 transfer (actually a distance-0 transfer). Let us now consider the more
interesting case when $|T| \ge 2$; then there exists a node $r$ of which every child is a leaf. Let $v_1, \ldots, v_\ell$
be the children of $r$, in the non-increasing order of capacity: $L(v_1) \ge \cdots \ge L(v_\ell)$. Let $T_r$ denote
the subtree rooted at $r$ and $Y := \sum_{j=1}^{\ell} y_{v_j}$. The algorithm considers two separate cases depending
on whether $Y$ is an integer.

Let us start with the simpler case when $Y$ is an integer: the algorithm selects the set $S_r$ consisting
of the $Y + 1$ vertices of highest capacity in $T_r$. As every pair of nodes in $T_r$ are within a distance
of 2, $S_r$ is a distance-2 transfer of the tree instance induced by $T_r$. The algorithm then solves the
tree instance induced by $\tilde T := T \setminus T_r$ to obtain a distance-2 transfer $\tilde S$ of size $\sum_{v \in V(T)} y_v - Y - 1$. It
follows that $S := S_r \cup \tilde S$ is a distance-2 transfer of $(T, L, y)$.

We now consider the final, more interesting case when $Y$ is not an integer. In this case, we
cannot consider $T_r$ and $T \setminus T_r$ as two separate instances because the $y$-values suggest to either open
$\lfloor Y \rfloor + 1$ or $\lceil Y \rceil + 1$ centers in $T_r$: a choice that depends on the selected centers in $T \setminus T_r$. As at least
$\lfloor Y \rfloor + 1$ of the vertices in $T_r$ will be selected as centers in either case, the algorithm will naturally
commit itself to open the $\lfloor Y \rfloor + 1$ vertices in $T_r$ of highest capacity. Let $S_{\text{commit}}$ denote that set
and note that it equals $\{v_1, \ldots, v_{\lfloor Y \rfloor}, r\}$ or $\{v_1, \ldots, v_{\lfloor Y \rfloor}, v_{\lfloor Y \rfloor + 1}\}$ depending on which of $r$ and
$v_{\lfloor Y \rfloor + 1}$ has the higher capacity ($v_{\lfloor Y \rfloor + 1}$ is well defined since the number of children $\ell$ is
at least $\lceil Y \rceil$, as every $y_{v_j} \le 1$). By the selection of $S_{\text{commit}}$, we have

$$\sum_{u \in T_r} y_u L(u) \le \sum_{s \in S_{\text{commit}}} L(s) + y_p L(p), \qquad (1)$$

where $y_p = Y - \lfloor Y \rfloor$ and $L(p) = \min[L(r), L(v_{\lfloor Y \rfloor + 1})]$. In other words, if the algorithm on the one
hand chooses to only open the $\lfloor Y \rfloor + 1$ centers $S_{\text{commit}}$ in $T_r$, then an additional fractional capacity
$y_p L(p)$ needs to be transferred from $T_r$ to an open center in $T \setminus T_r$. On the other hand, if the
algorithm chooses to open all the $\lceil Y \rceil + 1$ centers in $S_{\text{commit}} \cup \{v_{\lfloor Y \rfloor + 1}, r\}$ then those centers can
accommodate all the fractional capacity in $T_r$ together with $(1 - y_p) L(p)$ additional capacity.
Figure 2: (a) The construction of $\tilde{T}$ from $T$ with the subtree $T_r$ rooted at $r$ with children $v_1$ and $v_2$; the grey vertices are those selected in potential solutions to $\tilde{T}$ and $T$, respectively. (b) The bipartite graph $G$ and the induced subgraphs $\tilde{G}$ and $G_r$ that are used in the proof of Claim 10.



We defer this decision to be based on the solution of the smaller tree instance $(\tilde{T}, \tilde{L}, \tilde{y})$ obtained from $(T, L, y)$ as follows (see also Figure 2): replace $T_r$ by the vertex $p$ that represents the deferred decision and let $\tilde{y}, \tilde{L}$ be the natural restrictions of $y, L$ on $T \setminus T_r$ with $\tilde{y}_p = Y - \lfloor Y \rfloor$ and $\tilde{L}(p) = \min[L(r), L(v_{\lfloor Y \rfloor + 1})]$. The algorithm then recursively solves this smaller instance to obtain a distance-2 transfer $\tilde{S}$ of $\tilde{T}$. From $\tilde{S}$ it constructs the solution $S$ to the original problem instance by first replacing $p$ (if $p \in \tilde{S}$) by whichever of $v_{\lfloor Y \rfloor + 1}$ and $r$ was not chosen to be in $S_{\mathrm{commit}}$, and then adding $S_{\mathrm{commit}}$ to it.
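The recursion described above can be unrolled into a loop. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the tree is given as a `parent` map, every internal vertex has opening 1, leaves have opening at most 1, and the total opening is an integer. The function name, the `stands_for` bookkeeping, and the placeholder keys `("deferred", i)` are hypothetical names introduced here for illustration.

```python
from fractions import Fraction
from math import floor


def distance2_transfer(parent, cap, y):
    """Return the set of opened vertices.

    parent: vertex -> parent vertex (None for the root)
    cap:    vertex -> capacity L(v)
    y:      vertex -> opening value (assumed 1 on internal vertices)
    """
    parent, cap = dict(parent), dict(cap)
    y = {v: Fraction(w) for v, w in y.items()}
    # A placeholder p stands for the original vertex whose opening it defers.
    stands_for = {v: v for v in parent}
    opened, fresh = set(), 0

    while len(parent) > 1:
        children = {}
        for v, q in parent.items():
            if q is not None:
                children.setdefault(q, []).append(v)
        # r: a vertex all of whose children are leaves.
        r = next(v for v, ks in children.items()
                 if all(k not in children for k in ks))
        kids = sorted(children[r], key=lambda v: cap[v], reverse=True)
        Y = sum(y[k] for k in kids)
        m = floor(Y)
        if Y == m:
            # Integral case: open the Y + 1 highest-capacity vertices of T_r.
            ranked = sorted(kids + [r], key=lambda v: cap[v], reverse=True)
            chosen, deferred = ranked[:m + 1], None
        else:
            # Fractional case: commit to floor(Y) + 1 vertices and defer
            # the decision on the lower-capacity one of r and v_{floor(Y)+1}.
            if cap[r] >= cap[kids[m]]:
                hi, lo = r, kids[m]
            else:
                hi, lo = kids[m], r
            chosen, deferred = kids[:m] + [hi], lo
        opened.update(stands_for[v] for v in chosen)
        if deferred is not None:
            p = ("deferred", fresh)
            fresh += 1
            parent[p] = parent[r]
            cap[p] = cap[deferred]      # min(L(r), L(v_{floor(Y)+1}))
            y[p] = Y - m
            stands_for[p] = stands_for[deferred]
        for v in kids + [r]:
            del parent[v]

    for v in parent:                    # base case: at most one vertex left
        if y[v] == 1:
            opened.add(stands_for[v])
    return opened
```

Exact rational arithmetic (`Fraction`) is used so that the test "is $Y$ an integer" is not affected by floating-point rounding.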

We complete the proof of Lemma 9 by arguing that $S$ is a distance-2 transfer of the original tree instance $(T, L, y)$. Note that, as $|\tilde{S}| = \sum_{v \in \tilde{T}} \tilde{y}_v = \sum_{v \in T} y_v - 1 - \lfloor Y \rfloor$, we have $|S| = |\tilde{S}| + |S_{\mathrm{commit}}| = \sum_{v \in T} y_v$ as required. It remains to verify the capacity condition in the definition of a distance-2 transfer:

Claim 10. We have $\sum_{u \in U} y_u L(u) \le \sum_{s \in S : d(s, U) \le 2} L(s)$ for all $U \subseteq V(T)$.



Proof of Claim. Consider the bipartite graph $G$ with left-hand side $V(T)$, right-hand side $S$, and an edge between $v \in V(T)$ and $s \in S$ if $d(s, v) \le 2$. For simplicity, we slightly abuse notation and think of $V(T)$ and $S$ as disjoint sets. Moreover, let $N(U)$ denote the neighbors of a subset $U$ of vertices in this graph and let $w : V(T) \cup S \to \mathbb{R}$ be weights on the vertices defined by

$$w(v) = \begin{cases} y_v L(v) & \text{if } v \in V(T), \\ L(v) & \text{if } v \in S. \end{cases}$$

With this notation, we can reformulate the condition of the claim as

$$\sum_{u \in U} w(u) \le \sum_{s \in N(U)} w(s) \quad \text{for all } U \subseteq V(T). \qquad (2)$$



To prove this, we shall prove a slightly stronger statement by verifying the condition separately on two bipartite graphs $G_r$ and $\tilde{G}$ that correspond to $T_r$ and $\tilde{T}$, respectively. We obtain $G_r$ and $\tilde{G}$ from $G$ as follows (see also Figure 2). First, add a vertex $p$ to the left-hand side by making a copy of $r \in T$, set $w(p) = y_p \cdot L(p)$, and update $w(r) = y_r L(r) - y_p L(p) = L(r) - y_p L(p) \ge 0$. Similarly, if $p \in \tilde{S}$ then add a copy $p$ of $r \in S$, set $w(p) = L(p)$, and update $w(r) = L(r) - L(p) \ge 0$. Note that after these operations the vertices of both the left-hand side and the right-hand side can naturally be partitioned into those that correspond to vertices in $T_r$ and those that correspond to vertices in $\tilde{T}$. Graphs $G_r$ and $\tilde{G}$ are the subgraphs induced by these two partitions.

Let us first verify that (2) holds for $\tilde{G}$. By construction, the total weight $w(U)$ of a subset $U$ of $V(\tilde{T})$ is equal to $\sum_{u \in U} \tilde{y}_u \tilde{L}(u)$ and the total weight $w(N(U))$ of its neighborhood in $\tilde{G}$ equals $\sum_{s \in \tilde{S} : d(s, U) \le 2} \tilde{L}(s)$. Hence, (2) holds since $\tilde{S}$ is a distance-2 transfer of $\tilde{T}$.

We conclude the proof of the claim by verifying (2) for $G_r$. As both the left-hand side and right-hand side of $G_r$ correspond to vertices in $T_r$, which are all within distance 2 of each other, $G_r$ is a complete bipartite graph. The total weight of the left-hand side is by construction $\sum_{u \in T_r} y_u L(u) - y_p L(p)$ and the total weight of the right-hand side is $\sum_{s \in T_r \cap S} L(s) - L(p) \cdot \mathbf{1}_{p \in \tilde{S}}$, which equals $\sum_{s \in S_{\mathrm{commit}}} L(s)$. The claim now follows from (1), i.e., from $\sum_{u \in T_r} y_u L(u) - y_p L(p) \le \sum_{s \in S_{\mathrm{commit}}} L(s)$. □

The above claim completes the analysis of the algorithm for finding an integral distance-2 transfer of a given tree instance, and Lemma 9 follows.
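For small instances, the condition of Claim 10 can be checked exhaustively. The sketch below is an illustrative brute-force verifier (exponential in the number of vertices), assuming the same `parent`/`cap`/`y` representation of a tree instance; the helper names `tree_distances` and `is_distance_r_transfer` are hypothetical and not part of the paper.

```python
from fractions import Fraction
from itertools import combinations


def tree_distances(parent):
    """All-pairs hop distances in a rooted tree given as vertex -> parent."""
    def path_to_root(v):
        path = [v]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        return path

    dist = {}
    for u in parent:
        depth_from_u = {x: i for i, x in enumerate(path_to_root(u))}
        dist[u] = {}
        for v in parent:
            # The first ancestor of v that is also an ancestor of u
            # is their lowest common ancestor.
            for j, x in enumerate(path_to_root(v)):
                if x in depth_from_u:
                    dist[u][v] = depth_from_u[x] + j
                    break
    return dist


def is_distance_r_transfer(parent, cap, y, S, r=2):
    """Check |S| = sum of y and the capacity condition for every nonempty U."""
    vertices = list(parent)
    if len(S) != sum(Fraction(y[v]) for v in vertices):
        return False
    dist = tree_distances(parent)
    for size in range(1, len(vertices) + 1):
        for U in combinations(vertices, size):
            demand = sum(Fraction(y[u]) * cap[u] for u in U)
            supply = sum(cap[s] for s in S
                         if any(dist[s][u] <= r for u in U))
            if demand > supply:
                return False
    return True
```

Such a checker is only meant as a sanity test on toy inputs; the proof above is what guarantees the condition in general.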

5 Better preprocessing for better algorithms 

In this section, we explore the possibility of a further improvement in the performance guarantee and integrality gap bounds via better preprocessing. We demonstrate this by presenting a 6-approximation algorithm for the {0, L}-case of our problem. Formally, this is the special case of the capacitated k-center problem in which all the vertex capacities are either 0 or L, for some integer L. Instances with this property will be called {0, L}-instances.

It turns out that instances arising from the NP-hardness results, as well as the gap instances for 
the standard LP relaxation are all of this form, so this special case seems to capture the essential 
combinatorial difficulty of the capacitated problem. For these, we prove the following theorem: 

Theorem 11. There is a polynomial-time algorithm, achieving a 6-approximation for {0, L}-instances 
of the capacitated k-center problem. 

General framework revisited. Let us recall the preprocessing done by Cygan et al. [9], explained in Section 2. The idea is to guess the optimum (call the guess $\tau$), and consider an unweighted graph $G_{\le \tau}$ in which we place an edge between $u, v$ if $d(u, v) \le \tau$. If we then solve the LP on such a graph, the integrality gap is unbounded, as the following example shows. We have two groups of 3 vertices, such that the distance within the groups is 1, and the distance between the groups is some large $C$. Suppose the capacity of each vertex is 2, and $k = 3$. Then the LP for the instance is feasible with $\tau = 1$, while OPT is $C$.
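The example can be checked by hand. The worked computation below assumes the standard LP relaxation in the form commonly stated for capacitated k-center (opening variables $y_v$, assignment variables $x_{uv}$ defined only for pairs within distance $\tau$); the constraint list is recalled here as an assumption rather than quoted from Section 2.

```latex
% Assumed form of LP_k on the graph G_{\le \tau}:
\begin{align*}
&\sum_{v} y_v = k, \qquad \sum_{v} x_{uv} = 1 \ \ \forall u, \qquad
  x_{uv} \le y_v, \qquad \sum_{u} x_{uv} \le L(v)\, y_v \ \ \forall v. \\
% A fractional solution for tau = 1 on the two-group instance:
&y_v = \tfrac12 \ \text{for all six vertices}, \qquad
  x_{uv} = \tfrac13 \ \text{for } u, v \text{ in the same group}. \\
% Feasibility, constraint by constraint:
&\sum_v y_v = 6 \cdot \tfrac12 = 3 = k, \qquad
  \sum_v x_{uv} = 3 \cdot \tfrac13 = 1, \qquad
  \tfrac13 \le \tfrac12, \qquad
  \sum_u x_{uv} = 1 \le 2 \cdot \tfrac12. \\
% Integral solutions:
&\text{With only } 3 \text{ centers, some group contains at most one center, of capacity } 2 < 3, \\
&\text{so one of its vertices is served from the other group, at distance } C.
\end{align*}
```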

The trick to avoid this situation is to restrict to connected components of $G_{\le \tau}$ defined above, then for each $i$, determine the smallest $k_i$ for which the LP is feasible in component $i$, and finally check if $\sum_i k_i \le k$. (If not, the guess $\tau$ is too small.) For connected graphs $G_{\le \tau}$, our main theorem shows that the integrality gap is at most 9. Cygan et al. [9] gave a connected {0, L}-instance with integrality gap 7, i.e., $LP_k(G_{\le \tau})$ is feasible with $\tau = 1$, while OPT $\ge 7$.

So in a nutshell, the steps above can be seen as coming up with a graph (in this case $G_{\le \tau}$) which has edges between $u, v$ only if it is feasible to assign $u$ to $v$ and vice versa, solving the LP on the connected components of this graph, and verifying that $\sum_i k_i \le k$, as described. The aim of better preprocessing would then be to come up with a graph with even fewer edges, while still guaranteeing that the optimum assignment is preserved. Intuitively, this could produce more connected components, so the condition $\sum_i k_i \le k$ becomes stronger.

For the {0, L}-case, we prove that an extremely simple additional preprocessing, namely removing edges between vertices $u$ and $v$ with $L(u) = L(v) = 0$, provably lowers the integrality gap. Our result is then the following.

Theorem 12. Suppose $G^{*}_{\le \tau}$ is a connected component after the two preprocessing steps above, and suppose $LP_k(G^{*}_{\le \tau})$ is feasible, for some $k$. Then there is an algorithm to compute a set of $k$ vertices to open and an assignment of every vertex $u$ to an open center $v$ such that $d(u, v) \le 6$, and the capacity constraints are satisfied.

The preprocessing leads to additional structure in the instance which we then use carefully in our rounding procedure. The proof is presented in Appendix B. A natural open question is whether such an approach can be applied to the general problem as well, improving our 9-approximation algorithm.
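The two preprocessing steps of this section (thresholding at the guess $\tau$, then dropping edges between two zero-capacity vertices) followed by the split into connected components can be sketched as follows. This is a minimal illustration under assumptions: `dist` is any metric given as a Python function, the function name `preprocess_components` is hypothetical, and the per-component LP solve that determines each $k_i$ is deliberately left out.

```python
def preprocess_components(points, dist, cap, tau):
    """Connected components of the preprocessed threshold graph.

    An edge {u, v} is kept when dist(u, v) <= tau and at least one endpoint
    has positive capacity (edges between two 0-nodes are removed).
    """
    adj = {u: set() for u in points}
    for u in points:
        for v in points:
            if u != v and dist(u, v) <= tau and (cap[u] > 0 or cap[v] > 0):
                adj[u].add(v)

    seen, components = set(), []
    for start in points:
        if start in seen:
            continue
        component, stack = set(), [start]
        seen.add(start)
        while stack:                      # depth-first search
            u = stack.pop()
            component.add(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(component)
    return components
```

On four points on a line with capacities 2, 0, 0, 2 and $\tau = 1$, the middle edge joins two 0-nodes and is removed, so the instance splits into two components instead of one.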

6 Extensions to other problems and future directions 

Our techniques can be extended to obtain approximation algorithms for other problems. Appendix A discusses two problems to which our techniques readily apply: first we study the capacitated k-supplier problem, a variant of k-center where the sets of clients and facilities are specified separately, and give an 11-approximation algorithm. We then consider the budget generalization of the k-center problem, where the general capacity problem is inapproximable but we give a 9-approximation algorithm when the capacities are uniform. We see this as further evidence that the simplicity of our approach helps in designing better algorithms also for other location problems.

As our 9-approximation algorithm comes close to settling the integrality gap, it is natural to 
ask if our techniques can be used to obtain a tight result. Recall that our framework consists of 
first reducing the general problem to tree instances and then solving such instances. Since our 
algorithm for tree instances is optimal, any potential improvement must come from the reduction, 
and we raise this as an open problem. 

Finally, our preliminary results on additional preprocessing indicate that further investigation is necessary to understand if these techniques can help bring down the integrality gap to the tight factor of 3. More generally, we believe that it is important not only for capacitated k-center but also for other problems, such as facility location and k-median, to understand the power of lift-and-project methods (applied to potentially different formulations). For example, do they automatically capture these preprocessing steps and lead to stronger formulations?

References 

[1] V. Arya, N. Garg, R. Khandekar, A. Meyerson, K. Munagala, and V. Pandit. Local search heuristics for k-median and facility location problems. SIAM J. Comput., 33(3):544-562, 2004.

[2] M. Bansal, N. Garg, and N. Gupta. A 5-approximation for capacitated facility location. In 
ESA, pages 133-144, 2012. 

[3] Judit Bar-Ilan, Guy Kortsarz, and David Peleg. How to allocate network centers. J. Algorithms, 
15(3):385-415, 1993. 

[4] J. Byrka. An optimal bifactor approximation algorithm for the metric uncapacitated facility 
location problem. In APPROX-RANDOM, pages 29-43, 2007. 

[5] M. Charikar and S. Guha. Improved combinatorial algorithms for facility location problems. SIAM J. Comput., 34(4):803-824, 2005.

[6] M. Charikar, S. Guha, E. Tardos, and D. B. Shmoys. A constant-factor approximation algo- 
rithm for the k-median problem. J. Comput. Syst. Sci., 65(1):129-149, 2002. 

[7] F. A. Chudak and D. P. Williamson. Improved approximation algorithms for capacitated 
facility location problems. Math. Program., 102(2):207-222, 2005. 

[8] J. Chuzhoy and Y. Rabani. Approximating k-median with non-uniform capacities. In SODA, pages 952-958, 2005.

[9] M. Cygan, M. Hajiaghayi, and S. Khuller. LP rounding for k-centers with non-uniform hard 
capacities. In FOCS, pages 273-282, 2012. 

[10] T. F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theor. Comput. 
Sci., 38:293-306, 1985. 

[11] S. Guha and S. Khuller. Greedy strikes back: Improved facility location algorithms. J. Algorithms, 31(1):228-248, 1999.

[12] P. Hall. On representatives of subsets. Journal of the London Mathematical Society, 10:26-30, 
1935. 

[13] D. S. Hochbaum and D. B. Shmoys. A best possible heuristic for the k-center problem. Mathematics of Operations Research, 10:180-184, 1985.

[14] K. Jain, M. Mahdian, and A. Saberi. A new greedy approach for facility location problems. 
In STOC, pages 731-740, 2002. 

[15] K. Jain and V. V. Vazirani. Approximation algorithms for metric facility location and k-median problems using the primal-dual schema and Lagrangian relaxation. J. ACM, 48(2):274-296, 2001.

[16] S. Khuller and Y. J. Sussmann. The capacitated k-center problem. SIAM J. Discrete Math., 13(3):403-418, 2000.

[17] M. R. Korupolu, C. Greg Plaxton, and R. Rajaraman. Analysis of a local search heuristic for facility location problems. J. Algorithms, 37(1):146-188, 2000.

[18] R. Levi, D. B. Shmoys, and C. Swamy. LP-based approximation algorithms for capacitated 
facility location. In IPCO, pages 206-218, 2004. 

[19] S. Li. A 1.488 approximation algorithm for the uncapacitated facility location problem. In 
ICALP (2), pages 77-88, 2011. 

[20] S. Li and O. Svensson. Approximating k-median via pseudo-approximation. In STOC, 2013. To appear.

[21] M. Pal, E. Tardos, and T. Wexler. Facility location with nonuniform hard capacities. In FOCS, 
pages 329-338, 2001. 

[22] D. B. Shmoys, E. Tardos, and K. Aardal. Approximation algorithms for facility location 
problems (extended abstract). In STOC, pages 265-274, 1997. 

[23] D. P. Williamson and D. B. Shmoys. The Design of Approximation Algorithms. Cambridge University Press, 2011.

A Extensions to other problems 

We believe that the simplicity of our approach could be key to generalizing it to other location 
problems with capacity constraints. In this section, we see how our ideas readily apply to two 
problems. 

A.1 Capacitated k-supplier

In this subsection, we present an 11-approximation algorithm for the capacitated k-supplier problem. This problem is a generalization of the capacitated k-center problem in which some vertices are designated clients and some facilities. We can only open k of the facilities, and the aim is to serve the clients (facilities do not have to be served).

Let us denote by $\mathcal{C}$ and $\mathcal{F}$ the sets of clients and facilities, respectively. For this version, we prove the following.

Theorem 13. There exists a polynomial-time 11-approximation algorithm for the capacitated k-supplier problem.

The algorithm proceeds along the lines of our main result. We first guess the optimum $\tau$, and restrict to the bipartite graph $G$ on vertex sets $\mathcal{C}, \mathcal{F}$, with an edge between $u \in \mathcal{C}$ and $v \in \mathcal{F}$ iff $d(u, v) \le \tau$. We then divide this into connected components and work with them separately, as before. Thus in what follows, let us assume that $G$ as defined above is connected, and $LP_k(G)$ is feasible. Note that this is a slightly different LP, where $y$-variables exist only for facilities, and the constraints $\sum_{v : (u, v) \in E} x_{uv} = 1$ exist only for the clients.

The main difference in this variant is in the clustering step. This now works as follows. Start with a client $u \in \mathcal{C}$, and include all of $N^{+}(u)$ in the cluster $C_u$. Now as long as possible, do the following: pick a client $u \in \mathcal{C}$ which is at a distance $> 2$ from the midpoints of all the clusters so far, but is at distance precisely 4 from some cluster midpoint; include all of $N^{+}(u)$ into the cluster $C_u$ (there will not be an overlap with other clusters because of the distance condition).

When the procedure ends, we will be left with some clients at distance 2 from some cluster midpoints, and some facilities at distance 3 from some cluster midpoints (and nothing else, by connectivity properties). We move them to the closest cluster (breaking ties arbitrarily). Now the procedure satisfies the following conditions:

1. Each cluster has its $y$-values adding up to $\ge 1$ (indeed, the neighborhood of the cluster midpoint has total $y$-value $\ge 1$, as is required in the tree reduction).

2. The graph of clusters, in which we place an edge if the midpoints are at distance precisely 4, is connected.

These properties ensure that we can perform precisely the same reduction to tree instances; however, we obtain a variant of the reduction lemma: a distance-$r$ transfer of the tree instance now implies a $(4r + 3)$-approximation algorithm for the client/facility problem. This is because adjacent cluster midpoints are at distance 4, and hence the distance in $G$ between two auxiliary vertices (as in the reduction) which have distance $r$ in the tree instance is now $4r$. The rest of the proof carries over verbatim, and we obtain a reduction to tree instances with the above guarantee.

This proves Theorem 13, because for tree instances we can use our algorithm, which gives $r = 2$. □

A.2 Budgeted version with uniform capacities

The budgeted center problem is a weighted generalization of the k-center problem: in the k-center problem, opening a center incurs the uniform cost of one and there is a budget of $k$ on the total opening cost; on the other hand, in the budgeted center problem, the opening costs are given by $C : V \to \mathbb{R}_{+}$ that is a part of the input along with the total budget $B \in \mathbb{R}_{+}$. It is NP-hard to approximate this problem to any approximation ratio if the vertices have general capacities; this can be shown by a straightforward reduction from the Knapsack Problem. However, for uniform capacities, Khuller and Sussmann [16], using the technique of Bar-Ilan, Kortsarz, and Peleg [3], give a 13-approximation algorithm. In this subsection, we present a 9-approximation algorithm for the budgeted center problem with uniform capacities. We note that it is easy to extend this result to the {0, L}-case as well.

Let $L_0 \in \mathbb{N}$ be the uniform capacity. The following is the key lemma of our analysis:

Lemma 14. Suppose there exists a polynomial-time algorithm that finds an integral distance-$r$ transfer of a tree instance. Then there exists an algorithm that, given a connected graph $G = (V, E)$, the constant capacity function $L : V \to \{L_0\}$, and $k \in \mathbb{N}$ for which $LP_k(G)$ has a feasible solution $(x, y)$, in addition to the opening costs $C : V \to \mathbb{R}_{+}$, finds an integral distance-$(3r + 2)$ transfer $y'$ of $(G, L, y)$ satisfying $\sum_{v \in V} C(v) y'_v \le \sum_{v \in V} C(v) y_v$.

Our 9-approximation algorithm follows from Lemma 14.

Theorem 15. There exists a 9-approximation algorithm for the budgeted center problem with 
uniform capacities. 

Proof. Let OPT denote the optimal solution value. As in Lemma 2, our algorithm makes a guess $\tau$ at the optimal solution value and tries to decide if $\tau <$ OPT. In this problem again, we consider the graph $G_{\le \tau}$ representing the admissible assignments. Consider the connected components of $G_{\le \tau}$; for each component $G_i$, we will compute a lower bound $B_i$ on the minimum budget necessary to have a feasible solution to the subproblem induced by $G_i$. Observe that, if $\tau \ge$ OPT, an optimal solution assigns every vertex to a center that is in the same connected component. Thus, if $\sum_i B_i > B$, we can certify that $\tau <$ OPT. $B_i$ is determined by solving $LP_{k_i}(G_i)$, but with an objective of minimizing the opening cost $\sum_{v \in G_i} C(v) y_v$ rather than as a feasibility LP with no objective function; $k_i$ is chosen by trying all integers from 1 to $|V(G_i)|$ and selecting the one that gives the smallest opening cost. If we failed to certify $\tau <$ OPT, this means $\sum_i B_i \le B$. Now for each $G_i$, Lemmas 4 and 14 let us find a set of vertices to open for which there exists an assignment of every vertex to an open center that is within distance $3r + 3$, and the total opening cost of this set is no greater than $B_i$. The union of these sets is the desired solution from the triangle inequality. Recall that $r$ can be taken as two, from Lemma 9. □



Proof of Lemma 14. We invoke the rounding procedure given in Section 3, but with the "fake" capacity function $\tilde{L}$ defined as $\tilde{L}(v) := C_{\max} - C(v)$, where $C_{\max} := 1 + \max_{v \in V} C(v)$. By the guarantee of that procedure, the output vector $y'$ is an integral distance-$(3r + 2)$ transfer of $(G, \tilde{L}, y)$. Since $y'$ is a distance-$(3r + 2)$ transfer, we have $\sum_{v \in V} y'_v = \sum_{v \in V} y_v = k$, and by taking $U = V$ in the capacity condition of the definition of a transfer, we also have $\sum_{v \in V} \tilde{L}(v) y_v \le \sum_{v \in V} \tilde{L}(v) y'_v$. Since $\sum_{v \in V} \tilde{L}(v) y'_v = C_{\max} \cdot k - \sum_{v \in V} C(v) y'_v$ and $\sum_{v \in V} \tilde{L}(v) y_v = C_{\max} \cdot k - \sum_{v \in V} C(v) y_v$, this implies that $\sum_{v \in V} C(v) y'_v \le \sum_{v \in V} C(v) y_v$.

On the other hand, one can see that the decisions made by our rounding procedure depend purely on the relative ordering of capacities, rather than their actual values. Hence, the complete "execution history" of the rounding procedure with $\tilde{L}$ could also be interpreted as a valid execution history with the true capacity function $L$: if the procedure is executed with $L$, every comparison of capacities will be a tie since $L$ is a constant function, and we can break the ties so that they are consistent with the ordering of $\tilde{L}$. Therefore, it is possible that our rounding algorithm outputs $y'$ when it is run with $L$, and by the same guarantee $y'$ is an integral distance-$(3r + 2)$ transfer of $(G, L, y)$. □

B 6-approximation algorithm for the {0, L}-case 



In this section, we present the 6-approximation algorithm for the {0, L}-case by proving Theorem 12. We call a vertex a 0-node if its capacity is zero, and an L-node otherwise. Let $V_L$ denote the set of L-nodes. $N^{L+}(v)$ denotes $N^{+}(v) \cap V_L$. Let $G = (V, E)$ denote the connected component of $G_{\le \tau}$ after the two preprocessing steps described in Section 5.

Recall that the 9-approximation algorithm rounds the opening variables of the LP solution "locally": it considers the tree of clusters in bottom-up fashion, and for each subtree $T_u$, it opens $\lfloor y(T_u) \rfloor$ centers while deferring the decision of whether to open one additional center to the later subinstances. Our 6-relaxed decision procedure also operates as a bottom-up local rounding procedure, but in this case, our preprocessing ensures that a path from (the midpoint of) a child cluster to (the midpoint of) the parent does not contain consecutive 0-nodes; this implies that L-nodes are very well "dispersed" throughout the graph, permitting local rounding to be performed at a finer granularity within closer proximity. In fact, even without such a change in the granularity of rounding, a careful choice of $m_v$ alone with the original rounding algorithm is sufficient to give an 8-relaxed decision procedure.

Further improvements are facilitated by a better clustering. The clustering algorithm of Khuller and Sussmann [16] that is used by our 9-approximation algorithm finds cluster midpoints that are connected by length-three paths. This is in order to guarantee that $y(C_v) \ge 1$ for each cluster $C_v$, by ensuring $N^{+}(v) \subseteq C_v$. However, in a {0, L}-instance, $N^{L+}(v) \subseteq C_v$ is sufficient to yield $y(N^{+}(v) \cap C_v) \ge 1$, and hence we can choose two vertices that are at distance 2 as cluster midpoints as long as all their common neighbors are 0-nodes. This observation leads to an improved clustering where some parent and child can be closer.

Clustering algorithm. Our clustering algorithm identifies clusters one by one, and each time a new cluster midpoint $v$ is identified, $N^{L+}(v)$ is allotted to the new cluster $C_v$. The next cluster midpoint is always chosen at distance 2 from the set of already allotted vertices to ensure that the sets $N^{L+}(v)$ of the clusters are disjoint. In what follows, $V_{\mathrm{allotted}}$ denotes the set of vertices that have already been allotted to a cluster by the algorithm; for $u \in V_{\mathrm{allotted}}$, $\alpha(u)$ denotes the midpoint of the cluster that $u$ is allotted to: $u \in C_{\alpha(u)}$; finally, $\mathrm{dist}(v)$ denotes the shortest distance from $V_{\mathrm{allotted}}$ to $v$: $\mathrm{dist}(v) := \min_{u \in V_{\mathrm{allotted}}} d_G(u, v)$.

Algorithm 1 shows our clustering algorithm. In addition to identifying the clusters, our algorithm chooses $p(v) \in C_v$ for each cluster $C_v$, on which the opening of one will be aggregated. Also, for each non-root cluster $C_v$, the algorithm finds a vertex in the parent cluster through which $v$ is connected to the parent cluster and calls it $\pi_1(v)$. At the end of the algorithm, we assign every unallotted L-node to a nearby cluster; the algorithm annotates each of these vertices with $\pi_2(v)$, where $\pi_2(v)$ denotes the vertex through which $v$ is connected to $\alpha(v)$.

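Algorithm 1 Clustering algorithm.

1: $V_{\mathrm{allotted}} \leftarrow \emptyset$
2: Let $v$ be an arbitrary L-node
3: Create a new cluster centered at $v$: $C_v \leftarrow N^{L+}(v)$; $V_{\mathrm{allotted}} \leftarrow V_{\mathrm{allotted}} \cup C_v$
4: $p(v) \leftarrow v$
5: while $\exists w \in V_L$, $\mathrm{dist}(w) \ge 2$ do
6:     Let $v \in V$ be an arbitrary vertex with $\mathrm{dist}(v) = 2$
7:     $u^{*} \in \arg\min_{u \in V_{\mathrm{allotted}}} d_G(u, v)$
8:     Create a new cluster centered at $v$, as a child of $C_{\alpha(u^{*})}$: $C_v \leftarrow \{v\} \cup N^{L+}(v)$; $V_{\mathrm{allotted}} \leftarrow V_{\mathrm{allotted}} \cup C_v$
9:     $\pi_1(v) \leftarrow u^{*}$
10:    if $v$ is an L-node then $p(v) \leftarrow v$ else $p(v)$ is arbitrarily chosen from $N(v) \cap N(u^{*})$
11: $V^{\text{Step 11}}_{\mathrm{allotted}} \leftarrow V_{\mathrm{allotted}}$
12: for all $v \in V_L \setminus V_{\mathrm{allotted}}$ do
13:     Let $u$ be an arbitrary vertex in $V^{\text{Step 11}}_{\mathrm{allotted}} \cap N(v)$
14:     $C_{\alpha(u)} \leftarrow C_{\alpha(u)} \cup \{v\}$
15:     $\pi_2(v) \leftarrow u$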
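The clustering steps above can be sketched in Python as follows. This is an illustrative reading of Algorithm 1 under assumptions, not the authors' code: the graph is a connected adjacency map `adj` (vertex to set of neighbors) with comparable vertex labels, "arbitrary" choices are resolved with `min` for determinism, and the names `bfs` and `cluster` are hypothetical.

```python
from collections import deque


def bfs(adj, sources):
    """Hop distance from the nearest source to every vertex."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def cluster(adj, cap):
    """Return (clusters, p, pi1, pi2) following the steps of Algorithm 1."""
    VL = {v for v in adj if cap[v] > 0}           # the L-nodes

    def nl_plus(v):                               # N^{L+}(v)
        return ({v} | adj[v]) & VL

    root = min(VL)                                # Step 2 (arbitrary L-node)
    clusters = {root: nl_plus(root)}              # Step 3
    alpha = {u: root for u in clusters[root]}     # keys = allotted vertices
    p, pi1, pi2 = {root: root}, {}, {}            # Step 4

    while True:                                   # Steps 5-10
        dist = bfs(adj, list(alpha))
        if all(dist[w] <= 1 for w in VL):
            break
        v = min(x for x in adj if dist[x] == 2)
        from_v = bfs(adj, [v])
        u_star = min(u for u in alpha if from_v[u] == 2)
        clusters[v] = {v} | nl_plus(v)
        for u in clusters[v]:
            alpha[u] = v
        pi1[v] = u_star
        p[v] = v if v in VL else min(adj[v] & adj[u_star])

    snapshot = set(alpha)                         # Step 11
    for v in sorted(VL - snapshot):               # Steps 12-15
        u = min(snapshot & adj[v])
        clusters[alpha[u]].add(v)
        alpha[v] = alpha[u]
        pi2[v] = u
    return clusters, p, pi1, pi2
```

On the path 0-1-2-3-4 where only vertex 3 has capacity zero, vertex 3 becomes a 0-node midpoint with $p(3) = 2$ and $\pi_1(3) = 1$, which are at distance 1, in line with the distance bounds the analysis relies on.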

Lemma 16. Algorithm 1 is well-defined, and its output satisfies the following.

(i) $N^{L+}(v) \subseteq C_v$ for every $C_v$, and the $C_v$'s are disjoint;

(ii) every L-node is allotted to some cluster, and a 0-node is allotted only when it becomes a cluster midpoint;

(iii) $p(v) \in N^{L+}(v)$ for every $C_v$;

(iv) $\pi_1(v)$, when defined, is in $N^{L+}(x)$ for some $C_x$; $\pi_2(v)$, when defined, is in $N^{L+}(y)$ for some $C_y$;

(v) $C_v = \begin{cases} N^{L+}(v) \cup \{u \mid \pi_2(u) \in N^{L+}(v)\}, & \text{if } v \text{ is an L-node;} \\ \{v\} \cup N^{L+}(v) \cup \{u \mid \pi_2(u) \in N^{L+}(v)\}, & \text{if } v \text{ is a 0-node.} \end{cases}$

Proof. Since $LP_k(G)$ is feasible, $V_L \ne \emptyset$ and $v$ can be chosen at Step 2. At Step 6, as there exists $w \in V_L$ with $\mathrm{dist}(w) \ge 2$, there exists a vertex $v$ with $\mathrm{dist}(v) = 2$, for example the one that appears on a path of length $\mathrm{dist}(w)$ from $V_{\mathrm{allotted}}$ to $w$. Note that $v \in V$ may be a 0-node or an L-node. At Step 10, $d_G(u^{*}, v) = 2$ from the choice of $u^{*}$ and hence $N(v) \cap N(u^{*})$ is nonempty. When the while loop terminates, $\mathrm{dist}(w) \le 1$ for every $w \in V_L$; thus, $v$ at Step 13 satisfies $\mathrm{dist}(v) = 1$ and therefore $u$ can be chosen. The algorithm is well-defined.

Each time a new cluster $C_v$ is created, $N^{L+}(v)$ is added to $C_v$: $N^{L+}(v) \subseteq C_v$. The only two cases in which we create a new cluster $C_v$ are when it is the first cluster created, and when $\mathrm{dist}(v) = 2$. In the latter case, since $\mathrm{dist}(v) = 2$, $N^{L+}(v) \cap V_{\mathrm{allotted}} = \emptyset$ and therefore $\{v\} \cup N^{L+}(v)$ is disjoint from $V_{\mathrm{allotted}}$, the set of already allotted vertices. Thus, at the beginning of Step 11, the $C_v$'s are disjoint. No new clusters are created in the rest of the algorithm and only the L-nodes that have not been allotted are added to exactly one of the existing clusters. Hence, Property (i) holds.

Property (ii) is easily verified, since Steps 12-15 ensure that every L-node is allotted, and the only case in which a 0-node is allotted is at Step 8, where the cluster midpoint $v$ is allotted.

At Step 10, if $v$ is a 0-node, $N(v) \subseteq V_L$ since every edge is incident to at least one L-node; Property (iii) follows from this observation.

Until Step 11 of the algorithm, a vertex is allotted only when it is a cluster midpoint or in $N^{L+}(v)$ for some cluster midpoint $v$. Thus, $u^{*} \in V_{\mathrm{allotted}}$ chosen at Step 7 is either a cluster midpoint or in $N^{L+}(v)$ for some $C_v$. Suppose $u^{*}$ is a cluster midpoint. If $u^{*}$ is an L-node, then $u^{*} \in N^{L+}(u^{*})$; suppose $u^{*}$ is a 0-node. As $d_G(u^{*}, v) = 2$ from the choice of $u^{*}$, there exists a vertex $z$ that is in both $N^{+}(u^{*})$ and $N^{+}(v)$. $z$ is an L-node since $u^{*}$ is a 0-node. Thus $z$ is in $N^{L+}(u^{*})$ and has to be in $V_{\mathrm{allotted}}$, contradicting $\mathrm{dist}(v) = 2$. Hence, in any case, $\pi_1(v) \in N^{L+}(x)$ for some $C_x$. At Step 13, $u \in V^{\text{Step 11}}_{\mathrm{allotted}}$ and hence either $u$ is a cluster midpoint or $u \in N^{L+}(y)$ for some $C_y$. If $u$ is a cluster midpoint, $v \in N^{L+}(u)$, contradicting $v \notin V^{\text{Step 11}}_{\mathrm{allotted}}$. Property (iv) is verified.

At the beginning of Step 11, for every $C_v$, $C_v = \{v\} \cup N^{L+}(v)$ from construction and it can only be augmented in the rest of the algorithm. When $v$ is added to a cluster at Step 14, it is added to $C_{\alpha(\pi_2(v))}$ and hence

$$C_v = \begin{cases} N^{L+}(v) \cup \{u \mid \pi_2(u) \in \{v\} \cup N^{L+}(v)\}, & \text{if } v \text{ is an L-node;} \\ \{v\} \cup N^{L+}(v) \cup \{u \mid \pi_2(u) \in \{v\} \cup N^{L+}(v)\}, & \text{if } v \text{ is a 0-node.} \end{cases}$$

Now Property (v) follows from Property (iv). □
Observation 17. For every non-root cluster $C_v$, the distance between $\pi_1(v)$ and $p(v)$ is 1 if $v$ is a 0-node, and 2 otherwise.

Proof. Note that $\pi_1(v)$ and $v$ are at distance 2, as can be seen from Step 7 of Algorithm 1; thus, if $v$ is an L-node, $p(v) = v$ and the distance between $\pi_1(v)$ and $p(v)$ is two. If $v$ is a 0-node, $p(v)$ is chosen from $N^{+}(\pi_1(v))$ at Step 10 of Algorithm 1. □



Rounding opening variables. Our algorithm will gradually round the opening variables y, 
starting from the original LP solution, until they become integral. This process will be described 
in terms of opening movements, where each movement specifies how much opening is moved from 
which L-node to which L-node. Since the L-nodes have the same capacities, if we show that a 
set of opening movements makes the opening variables integral while no opening is moved by the 
net distance of more than r, this implies that the resulting set of opening variables is an integral 
distance-r transfer. 

Our rounding procedure begins with changing $y_{p(v)}$ of every cluster $C_v$ to one: for each cluster $C_v$, we increase $y_{p(v)}$ until it reaches one, while simultaneously decreasing the opening variable of a vertex in $N^{L+}(v)$ by the same amount. This initial aggregation can be interpreted as opening movements, and keeps the budget constraint $\sum_{v \in V} y_v = k$ satisfied.

Observation 18. For each cluster $C_v$, the initial aggregation can be implemented by a set of opening movements within distance 1 if $p(v) = v$, and within distance 2 otherwise.

Proof. The sets $N^{L+}(v)$ are disjoint from Lemma 16, and $y(N^{L+}(v)) \ge 1$ from the LP constraints; hence, $y_{p(v)}$ can be made 1 via movements from $N^{L+}(v)$. Note that $p(v) \in N^{L+}(v)$ in any case. □

After the initial aggregation, the procedure considers each cluster $C_v$ in bottom-up order and makes the opening variable of every vertex in $C_v \setminus \{p(v)\}$ integral, using movements of distance 5 or smaller; $p(v)$ is propagated to the parent cluster, to be taken into account when that cluster is rounded. Precisely, the rounding procedure for $C_v$ rounds the opening variables of $I_v := V_L \cap (C_v \setminus \{p(v)\} \cup \{p(u) \mid \pi_1(u) \in C_v\})$, i.e., the set of L-nodes that are either propagated from a child cluster or originally in $C_v$, except the vertex to be propagated from $C_v$.

Algorithm 2 shows the procedure. First it recursively processes the children clusters, and then constructs a family of vertex sets $\{X_u\}_{u \in N^{L+}(v)}$ indexed by $N^{L+}(v)$. For $u \ne p(v)$, $X_u$ consists of $u$ itself, vertices propagated from the children clusters that are connected through $u$, and the vertices in $C_v$ that are connected to $v$ through $u$: $X_u := \{u\} \cup \{p(w) \mid \pi_1(w) = u\} \cup \{w \mid \pi_2(w) = u\}$. $X_{p(v)}$ is similarly defined, except that it does not contain $p(v)$. Now for every $u \in N^{L+}(v)$, we locally round $X_u$: we choose a set $W_u$ of the vertices to be opened, and move the openings of the other vertices to the vertices in $W_u$. Note that LocalRound($V_{\mathrm{toOpen}}$, $V_{\mathrm{moveFrom1}}$, $V_{\mathrm{moveFrom2}}$) is a procedure that increases the opening variables of the vertices in $V_{\mathrm{toOpen}}$ to one, while decreasing the opening variables of $V_{\mathrm{moveFrom1}}$ (and $V_{\mathrm{moveFrom2}}$ if $V_{\mathrm{moveFrom1}}$ is used up) to match the increase. $W_u$ is chosen as a subset of $X_u$, but we avoid choosing $u \in N^{L+}(v)$ whenever possible. After these local roundings, each $X_u$ may still have some non-integral opening variables remaining; we choose a set $W_F \subseteq N^{L+}(v) \setminus \{p(v)\}$ to accommodate these openings. Finally, if there still remains some fraction, we choose one last center $w^{*}$, and open it using the opening movements from $I_{\text{Step 12}}$ and $\{p(v)\}$. Note that $y_{p(v)}$, therefore, may become less than one at the termination of Round($v$).

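Algorithm 2 Rounding algorithm.

 1: procedure Round($v$)
 2:     for all children clusters $C_w$ do Round($w$)
 3:     $X_{p(v)} \leftarrow \{p(w) \mid \pi_1(w) = p(v)\} \cup \{w \mid \pi_2(w) = p(v)\}$
 4:     $X_u \leftarrow \{u\} \cup \{p(w) \mid \pi_1(w) = u\} \cup \{w \mid \pi_2(w) = u\}$ for all $u \in N^{L+}(v) \setminus \{p(v)\}$
 5:     for all $u \in N^{L+}(v)$ do
 6:         Choose $\lfloor y(X_u) \rfloor$ vertices from $X_u$; call it $W_u$ (avoid choosing $u$ unless $|X_u| = y(X_u)$)
 7:         LocalRound($W_u$, $X_u$, $\emptyset$)
 8:     Let $I_{\text{Step 8}}$ be the set of vertices in $\bigcup_{u \in N^{L+}(v)} X_u$ that have non-integral opening variables
 9:     $F := \{u \in N^{L+}(v) \setminus \{p(v)\} \mid y_u < 1\}$
10:     Choose $\lfloor \sum_{u \in I_{\text{Step 8}}} y_u \rfloor$ vertices from $F$; call it $W_F$
11:     LocalRound($W_F$, $I_{\text{Step 8}} \setminus X_{p(v)}$, $I_{\text{Step 8}} \cap X_{p(v)}$)
12:     Let $I_{\text{Step 12}}$ be the set of vertices in $\bigcup_{u \in N^{L+}(v)} X_u$ that have non-integral opening variables
13:     if $I_{\text{Step 12}} \neq \emptyset$ then
14:         Choose $w^*$ from $F \setminus W_F$ if $F \setminus W_F \neq \emptyset$; otherwise choose from $I_{\text{Step 12}}$
15:         LocalRound($\{w^*\}$, $I_{\text{Step 12}}$, $\{p(v)\}$)

16: procedure LocalRound($V_{\text{toOpen}}$, $V_{\text{moveFrom1}}$, $V_{\text{moveFrom2}}$)
17:     while $\exists u \in V_{\text{toOpen}}$ with $y_u < 1$ do
18:         Choose a vertex $w$ with nonzero opening from $V_{\text{moveFrom1}} \setminus V_{\text{toOpen}}$;
            if there exists none, choose from $V_{\text{moveFrom2}} \setminus V_{\text{toOpen}}$
19:         $\Delta \leftarrow \min(1 - y_u, y_w)$; increase $y_u$ by $\Delta$ and decrease $y_w$ by $\Delta$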
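
As an illustration of LocalRound and of Steps 6 and 7, the following is a minimal Python sketch, not the authors' implementation. The names `local_round`, `y`, `X_u`, and `W_u` are hypothetical, as is the toy instance. The procedure leaves the choice of donor vertex and the choice of $W_u$ open, so two tie-breaking rules are assumed here: the first available donor is used, and $u$ is placed last among the candidates for $W_u$. Exact rationals are used so that integrality can be tested without floating-point error.

```python
from fractions import Fraction
from math import floor


def local_round(y, to_open, move_from1, move_from2=()):
    """Raise y[u] to 1 for each u in to_open (sketch of LocalRound).

    Opening is taken from move_from1 first and from move_from2 only when
    move_from1 is used up; vertices being opened never act as donors.
    Returns the list of movements as (donor, receiver, amount).
    """
    to_open = list(to_open)
    moves = []
    for u in to_open:
        while y[u] < 1:
            donors = [w for w in move_from1 if w not in to_open and y[w] > 0]
            if not donors:
                donors = [w for w in move_from2 if w not in to_open and y[w] > 0]
            if not donors:
                raise ValueError("not enough fractional opening to move")
            w = donors[0]  # assumed tie-breaking: first available donor
            delta = min(1 - y[u], y[w])
            y[u] += delta
            y[w] -= delta
            moves.append((w, u, delta))
    return moves


# Toy set X_u = {u, a, b} with total opening 3/2 (hypothetical data).
y = {"u": Fraction(1, 3), "a": Fraction(1, 2), "b": Fraction(2, 3)}
X_u = ["u", "a", "b"]

# Step 6: open floor(y(X_u)) vertices, choosing u only if nothing else is left.
count = floor(sum(y[x] for x in X_u))
candidates = [x for x in X_u if x != "u"] + ["u"]
W_u = candidates[:count]

# Step 7: move opening within X_u onto the chosen vertices.
moves = local_round(y, W_u, X_u)

# Total opening still sitting on vertices that were not opened to one.
residual = sum(y[x] for x in X_u if y[x] != 1)
```

On this instance one vertex is opened, and the leftover opening equals $y(X_u) - \lfloor y(X_u) \rfloor$, which is the quantity the later steps of Round have to place.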

Lemma 19. Suppose that $y_{p(v)} = 1$ before Step 3 of Round($v$). Then Steps 3 to 15 of Round($v$)
make the opening variables of $I_v$ integral, and this can be implemented by a set of opening movements
within $I_v \cup \{p(v)\}$, with no incoming movements to $p(v)$. The maximum distance of these movements
is five, taking the initial aggregation into account.

Proof. Note that, from the properties established in Lemma 16, including (iii) and (iv), $\{X_u\}_{u \in N^{L+}(v)}$ forms a partition
of $I_v$. Thus it suffices to verify that the opening variables of each $X_u$ become integral. Also note
that $X_u \subseteq V_L$.

At Step 6 of the algorithm, we have $y(X_u) \le |X_u|$ from $y \le 1$; hence $W_u$ can be successfully
chosen. After Step 7, $X_u$ may still have some non-integral opening variables, but their total opening
is given by $r_u := y(X_u) - \lfloor y(X_u) \rfloor < 1$. Moreover, when $r_u > 0$, we have $u \notin W_u$ and therefore
$y_u < 1$. Thus, $\sum_{u \in I_{\text{Step 8}}} y_u = r_{p(v)} + \sum_{u \in N^{L+}(v) \setminus \{p(v)\}} r_u < 1 + |F|$ at Step 10 and therefore $W_F$
can be chosen as well. Note that $(I_{\text{Step 8}} \setminus X_{p(v)}) \cup (I_{\text{Step 8}} \cap X_{p(v)}) = I_{\text{Step 8}}$ and hence Step 18 of
LocalRound called from Step 11 will always succeed. After Step 11, the total non-integral opening
variables in $I_v$ will become strictly smaller than one. If $I_{\text{Step 12}} = \emptyset$, we are done. Otherwise, $w^*$
can be successfully chosen (since $I_{\text{Step 12}} \neq \emptyset$) and Step 15 will make the opening variables of $I_v$
completely integral, while making $y_{p(v)}$ smaller than one. Note that $y_{p(v)} = 1$ before Step 15.
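
The bound used at Step 10 can be unpacked as follows. This is one way to fill in the step, using only the facts already stated: $r_u < 1$ for every $u$, and $r_u > 0$ implies $y_u < 1$, so that $u \in F$ whenever $u \neq p(v)$.

```latex
\begin{align*}
\sum_{u \in I_{\text{Step 8}}} y_u
  &= r_{p(v)} + \sum_{u \in N^{L+}(v) \setminus \{p(v)\}} r_u
   = r_{p(v)} + \sum_{u \in F,\; r_u > 0} r_u \\
  &< 1 + |F| .
\end{align*}
```

Consequently $\lfloor \sum_{u \in I_{\text{Step 8}}} y_u \rfloor \le |F|$, so $F$ contains enough vertices for $W_F$ to be chosen.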



Now it remains to verify that this rounding can be realized in terms of opening movements
within the distance of five. When $x \in X_u$, one of the following holds: (i) $x = u$, (ii) $x = p(w)$
and $\pi_1(w) = u$, or (iii) $\pi_2(x) = u$. In Case (ii), $d_G(x, u) \le 2$ from Observation 17. In Case (iii),
$d_G(x, u) = 1$, as can be seen from Step 13 of Algorithm 1. Thus, for any $x \in X_u$, $x$ is within the
distance of 2 from $u$. Note that the opening at $x \in X_u$ has been moved from $N^{L+}(w)$ if $x = p(w)$;
otherwise, it originates from $x$ itself. From Observations 17 and 18, the opening at $x$, in Case (ii),
originates from vertices within the distance of three from $u$; in the other cases, it is from $x$ itself and
therefore within the distance of one. Thus, any movement resulting from Step 7 of Algorithm 2
moves opening that originally comes from vertices within the distance of three from $u$ to a vertex
within the distance of two from $u$; the maximum distance of these movements therefore is five.

Since the opening at $x \in X_u$ originates from vertices within the distance of three from $u$, it is
within the distance of four from $v$. On the other hand, every vertex in $F$ is within the distance of
one from $v$; therefore, the maximum distance of movements resulting from Step 11 also is five.



Suppose $w^*$ is chosen from $F \setminus W_F$ at Step 14. Then $w^*$ is within the distance of one from $v$; as
observed earlier, the opening at $x \in I_{\text{Step 12}}$ originates from vertices within the distance of four from
$v$. $p(v)$ is within the distance of one from $v$, and its opening originates from vertices within the
distance of two from $p(v)$ (see Observation 18); hence, the opening at $p(v)$ originates from vertices
within the distance of three from $v$. Thus, in this case, any movement resulting from Step 15
moves opening that originates from vertices within the distance of four from $v$ to a vertex within
the distance of one from $v$; the maximum distance of these movements therefore is five.



Suppose $F \setminus W_F = \emptyset$. In this case, $\sum_{u \in I_{\text{Step 8}} \setminus X_{p(v)}} y_u \le |F| = |W_F|$ and $(W_F \cap I_{\text{Step 8}}) \subseteq
(I_{\text{Step 8}} \setminus X_{p(v)})$; hence, $I_{\text{Step 8}} \setminus X_{p(v)}$ is used up during LocalRound called from Step 11. Therefore,
we have $I_{\text{Step 12}} \subseteq X_{p(v)}$. As observed earlier, $w^* \in X_{p(v)}$ is within the distance of two from $p(v)$;
the opening at $x \in X_{p(v)}$ is from vertices within the distance of three from $p(v)$. The opening at
$p(v)$ is from vertices within the distance of two from $p(v)$, as was seen in Observation 18. Thus,
the maximum distance of movements resulting from Step 15 is five in this case as well. □
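
The four distance bounds in the proof share one pattern: opening that originates at some vertex $o$ is moved to a target vertex $t$, and the triangle inequality through a pivot vertex bounds $d_G(o, t)$. The summary below restates the case analysis; the symbols $o$ and $t$ are introduced here only for illustration.

```latex
\begin{align*}
\text{Step 7:} \quad
  & d_G(o, t) \le d_G(o, u) + d_G(u, t) \le 3 + 2 = 5, \\
\text{Step 11:} \quad
  & d_G(o, t) \le d_G(o, v) + d_G(v, t) \le 4 + 1 = 5, \\
\text{Step 15, } w^* \in F \setminus W_F: \quad
  & d_G(o, w^*) \le d_G(o, v) + d_G(v, w^*) \le 4 + 1 = 5, \\
\text{Step 15, } F \setminus W_F = \emptyset: \quad
  & d_G(o, w^*) \le d_G(o, p(v)) + d_G(p(v), w^*) \le 3 + 2 = 5.
\end{align*}
```

In each case the pivot is the vertex around which both the origins and the targets are known to be close: $u$ for the local roundings, $v$ when the target lies in $F$, and $p(v)$ when the target lies in $X_{p(v)}$.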



Proof of Theorem 15. Let $C_r$ be the root cluster; we execute Round($r$) on the LP solution.
From Lemma 19, Round($r$) outputs a set of opening variables that can be realized by a set
of opening movements of distance five or smaller: note that $y_{p(v)} = 1$ before Step 3 of Round($v$),
since we process the clusters in the bottom-up order. As every vertex in $V_L \setminus \{p(r)\}$ is in $I_v$ for
some cluster $C_v$, their opening variables are made integral. Since $\sum_{v \in V_L} y_v = k$, $y_{p(r)}$ is also made
integral, and the opening movements to $p(r)$ during the initial aggregation were within the distance
of one. Thus the output set of open vertices is an integral distance-5 transfer.

Now Lemma 4 completes the proof. □